Insights Into Deep Non-Linear Filters for Improved Multi-Channel Speech Enhancement

Abstract

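The key advantage of using multiple microphones for speech enhancement is that spatial filtering can be used to complement the tempo-spectral processing. In a traditional setting, linear spatial filtering (beamforming) and single-channel post-filtering are commonly performed separately. In contrast, there is a trend towards employing deep neural networks (DNNs) to learn a joint spatial and tempo-spectral non-linear filter, which means that the restriction of a linear processing model and that of a separate processing of spatial and tempo-spectral information can potentially be overcome. However, the internal mechanisms that lead to good performance of such data-driven filters for multi-channel speech enhancement are not well understood. Therefore, in this work, we analyse the properties of a non-linear spatial filter realized by a DNN as well as its interdependency with temporal and spectral processing by carefully controlling the information sources (spatial, spectral, and temporal) available to the network. We confirm the superiority of a non-linear spatial processing model, which outperforms an oracle linear spatial filter in a challenging speaker extraction scenario for a low number of microphones by 0.24 POLQA score. Our analyses reveal that in particular spectral information should be processed jointly with spatial information, as this increases the spatial selectivity of the filter. Our systematic evaluation then leads to a simple network architecture that outperforms state-of-the-art network architectures on a speaker extraction task by 0.22 POLQA score and by 0.32 POLQA score on the CHiME3 data.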
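For orientation, the traditional two-stage pipeline that the abstract contrasts with a joint DNN filter can be sketched as follows. This is a minimal illustration and not the method of the paper: it assumes a uniform linear array, a far-field source, STFT-domain signals of shape (microphones, frequencies, frames), and a known noise power spectrum. All function names and parameters are hypothetical.

```python
import numpy as np

def delay_and_sum_weights(n_mics, spacing_m, angle_rad, freqs_hz, c=343.0):
    """Delay-and-sum weights for a uniform linear array (far-field assumption)."""
    mic_pos = np.arange(n_mics) * spacing_m              # (M,) positions in metres
    delays = mic_pos * np.cos(angle_rad) / c             # (M,) delays in seconds
    steering = np.exp(-2j * np.pi * freqs_hz[:, None] * delays[None, :])  # (F, M)
    return steering / n_mics

def beamform(stft_multi, weights):
    """Stage 1, linear spatial filter: y[f, t] = w[f]^H x[f, t]."""
    # stft_multi has shape (M, F, T); weights has shape (F, M).
    return np.einsum("fm,mft->ft", weights.conj(), stft_multi)

def postfilter(stft_single, noise_psd, floor=0.1):
    """Stage 2, single-channel post-filter: a real gain per time-frequency bin."""
    power = np.abs(stft_single) ** 2
    gain = np.maximum(1.0 - noise_psd[:, None] / np.maximum(power, 1e-12), floor)
    return gain * stft_single
```

The two stages are applied in sequence, `postfilter(beamform(x, w), noise_psd)`. The spatial stage is linear in the microphone signals, and the second stage sees only the single beamformer output channel. These are the two restrictions that a jointly learned non-linear filter can potentially overcome.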

Similar resources

Deep Neural Network Approach for Single Channel Speech Enhancement Processing

Multi-channel psychoacoustically motivated speech enhancement

Multichannel techniques offer advantages in noise reduction and overall output signal quality when compared to the well-studied mono approaches. In this paper we present an original multichannel psychoacoustically motivated noise reduction algorithm that naturally extends the single-channel psychoacoustic masking filter previously studied in the literature [1]. The optimality criterion is desig...

Multi-Modal Hybrid Deep Neural Network for Speech Enhancement

Deep Neural Networks (DNN) have been successful in enhancing noisy speech signals. Enhancement is achieved by learning a nonlinear mapping function from the features of the corrupted speech signal to that of the reference clean speech signal. The quality of predicted features can be improved by providing additional side channel information that is robust to noise, such as visual cues. In this p...

Utilizing Kernel Adaptive Filters for Speech Enhancement within the ALE Framework

The performance of the linear models widely used within the framework of adaptive line enhancement (ALE) deteriorates dramatically in the presence of non-Gaussian noise. On the other hand, adaptive implementation of nonlinear models, e.g. the Volterra filters, suffers from the severe problems of a large number of parameters and slow convergence. Nonetheless, kernel methods are emerging solutions t...

Improved Multi-band Spectral Subtraction Method for Speech Enhancement

In this paper, we propose a new approach to improve the performance of speech enhancement technique based on multi-band spectral subtraction for white Gaussian noise. First, the original power spectral subtraction and multiband spectral subtraction methods are surveyed and implemented. Next, the generalization is applied on multiband spectral subtraction. Finally, the flattened noise spectrum w...

Journal

Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Year: 2023

ISSN: 2329-9304, 2329-9290

DOI: https://doi.org/10.1109/taslp.2022.3221046